03. Metrics
Metrics for Inference Deployment
In 2012, AlexNet became the first DNN architecture to win the ImageNet classification challenge, with a top-5 error rate of 15.4%, resoundingly beating the next best entry's 26.2%. Ever more complex and accurate DNNs have been developed against the ImageNet benchmark ever since, including VGGNet, ResNet, Inception, GoogLeNet, and their many variations. The increased accuracy is the result of breakthroughs in design and optimization, but it comes at a cost in computational resources.
Analysis
An analysis (Canziani et al., 2016) of state-of-the-art DNNs, using additional computation metrics, provides insight into the design constraints of deployable systems that use DNNs for inference. Fourteen top architectures were trained on ImageNet, deployed on an NVIDIA Jetson TX1, and compared across the following metrics:
- Top-1 Accuracy
- Operations Count
- Network Parameters Count
- Inference Time
- Power Consumption
- Memory Usage
The following table provides a sampling of the results (values are approximated from graphs in the paper), including a derived metric called information density. Information density is a measure of a network's efficiency: how much accuracy it delivers per million parameters it requires.
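The information density metric can be computed directly from a network's accuracy and parameter count. A minimal sketch in Python, using illustrative values rather than the paper's measured numbers:

```python
def information_density(top1_accuracy_pct, num_params_millions):
    """Accuracy delivered per million parameters (higher = more efficient)."""
    return top1_accuracy_pct / num_params_millions

# Illustrative values only -- see the paper's graphs for measured numbers.
print(information_density(80.0, 40.0))   # 2.0 accuracy-% per million params
print(information_density(70.0, 140.0))  # 0.5 accuracy-% per million params
```

By this measure, the second (hypothetical) network is far less efficient even though its raw accuracy is only modestly lower.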
Note that only results for a batch size of one are included. In most cases, a larger batch size speeds up inference while the relative performance among architectures stays the same. One exception is AlexNet, which sees a 3x speedup when going from 1 to 64 images per batch, due to weak optimization of its fully connected layers. See the paper for a much more detailed summary!
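The batching effect can be made concrete with a quick throughput calculation. The latency numbers below are hypothetical, chosen only to illustrate the 3x shape of the AlexNet result:

```python
def throughput_fps(batch_size, batch_latency_s):
    """Images processed per second for a given batch size and per-batch latency."""
    return batch_size / batch_latency_s

# Hypothetical latencies: a batch of 64 takes far less than 64x the batch-1 time,
# so per-image throughput rises even though per-batch latency grows.
single = throughput_fps(1, 0.015)    # batch of 1 in 15 ms
batched = throughput_fps(64, 0.320)  # batch of 64 in 320 ms
print(batched / single)              # 3.0x throughput gain
```

The trade-off matters for robotics: batching helps throughput, but a batch-1 deployment minimizes the latency of any single camera frame.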
Conclusions
The Canziani analysis paper concludes with some key insights that are useful when optimizing a deployable robotic system using inference:
- Power consumption is independent of batch size and architecture
- “When full resources utilisation is reached, generally with larger batch sizes, all networks consume roughly an additional 11.8W”
- Accuracy and inference time are in a hyperbolic relationship
- “a little increment in accuracy costs a lot of computational time”
- Energy constraint is an upper bound on the maximum achievable accuracy and model complexity
- “if energy consumption is one of our concerns, for example for battery-powered devices, one can simply choose the slowest architecture which satisfies the application minimum requirements”
- The number of operations is a reliable estimate of the inference time.
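To see how the energy constraint bounds a deployed system, consider a back-of-the-envelope battery calculation: energy per inference is power multiplied by inference time. All numbers below are hypothetical, not taken from the paper:

```python
def inferences_per_charge(battery_wh, power_w, latency_s):
    """Inferences a battery can sustain, assuming the device only runs inference."""
    energy_per_inference_j = power_w * latency_s  # joules per inference
    battery_j = battery_wh * 3600.0               # convert Wh to joules
    return battery_j / energy_per_inference_j

# Hypothetical: 50 Wh battery, 10 W draw during inference, 50 ms per inference.
print(int(inferences_per_charge(50.0, 10.0, 0.050)))  # 360000 inferences
```

Under these assumptions, a slower but lower-accuracy architecture with half the latency would double the inference budget per charge, which is exactly the trade-off the paper's energy-constraint conclusion describes.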
Metrics Quiz
SOLUTION:
VGG has an outsized number of parameters compared to the other architectures, which directly affects the efficiency calculation.
Resources
- Canziani, Alfredo, Adam Paszke, and Eugenio Culurciello. "An analysis of deep neural network models for practical applications." arXiv preprint arXiv:1605.07678 (2016). https://arxiv.org/pdf/1605.07678.pdf
- Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in Neural Information Processing Systems. 2012. https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
- Adit Deshpande. "The 9 Deep Learning Papers You Need To Know About (Understanding CNNs Part 3)." Adeshpande3.github.io. 17 Dec. 2017. Web. 12 Jan. 2018. https://adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html
- Dave Gershgorn. "The data that transformed AI research—and possibly the world." Quartz. n.d. Web. 14 Jan. 2018. https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-research-and-possibly-the-world/